DisDiff: Unsupervised Disentanglement of Diffusion Probabilistic Models
Aiming to understand the underlying explainable factors behind
observations and to model the conditional generation process on these factors,
we connect disentangled representation learning to Diffusion Probabilistic
Models (DPMs) to take advantage of the remarkable modeling ability of DPMs. We
propose a new task, disentanglement of DPMs: given a pre-trained DPM, without
any annotations of the factors, the task is to automatically discover the
inherent factors behind the observations and disentangle the gradient fields of
the DPM into sub-gradient fields, each conditioned on the representation of one
discovered factor. With disentangled DPMs, those inherent factors can be
automatically discovered, explicitly represented, and clearly injected into the
diffusion process via the sub-gradient fields. To tackle this task, we devise
an unsupervised approach named DisDiff, achieving disentangled representation
learning in the framework of DPMs. Extensive experiments on synthetic and
real-world datasets demonstrate the effectiveness of DisDiff.
Comment: Accepted by NeurIPS 202
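The decomposition described in this abstract, a gradient field split into sub-gradient fields that are each conditioned on one factor's representation, can be illustrated with a toy sketch. This is not the DisDiff implementation: `base_field`, `make_sub_field`, and `composed_field` are hypothetical names, the fields are hand-written rather than learned, and adding the sub-fields to a base field is an assumption made only for illustration.

```python
from typing import Callable, List, Sequence

Vector = List[float]
SubField = Callable[[Sequence[float], float], Vector]


def base_field(x: Sequence[float]) -> Vector:
    # Hypothetical stand-in for a pre-trained model's gradient field:
    # it simply pulls every coordinate toward the origin.
    return [-xi for xi in x]


def make_sub_field(axis: int) -> SubField:
    # Hypothetical sub-gradient field tied to one factor. It only acts on
    # a single coordinate and pulls it toward the factor representation z.
    def sub_field(x: Sequence[float], z: float) -> Vector:
        g = [0.0] * len(x)
        g[axis] = z - x[axis]
        return g

    return sub_field


def composed_field(
    x: Sequence[float],
    factors: Sequence[float],
    sub_fields: Sequence[SubField],
) -> Vector:
    # Assumed composition: base field plus one sub-field per factor.
    total = base_field(x)
    for z, field in zip(factors, sub_fields):
        total = [t + g for t, g in zip(total, field(x, z))]
    return total


if __name__ == "__main__":
    subs = [make_sub_field(0), make_sub_field(1)]
    x = [1.0, 2.0]
    # Changing only the first factor changes only the first coordinate.
    print(composed_field(x, [0.5, 3.0], subs))
    print(composed_field(x, [2.0, 3.0], subs))
```

In this toy setting, swapping the representation of one factor alters only the contribution of its own sub-field, which is the behavior the abstract refers to as injecting a factor via its sub-gradient field.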
Super-NeRF: View-consistent Detail Generation for NeRF super-resolution
The neural radiance field (NeRF) has achieved remarkable success in modeling 3D
scenes and synthesizing high-fidelity novel views. However, existing NeRF-based
methods focus more on making full use of the image resolution to generate
novel views and give less consideration to generating details under limited
input resolution. By analogy with the extensive use of image super-resolution,
NeRF super-resolution is an effective way to generate a high-resolution
implicit representation of 3D scenes and holds great potential for applications.
So far, this important topic remains under-explored. In this paper, we
propose a NeRF super-resolution method, named Super-NeRF, to generate a
high-resolution NeRF from only low-resolution inputs. Given multi-view
low-resolution images, Super-NeRF constructs a consistency-controlling
super-resolution module to generate view-consistent high-resolution details for
NeRF. Specifically, an optimizable latent code is introduced for each
low-resolution input image to steer the 2D super-resolved images toward a
view-consistent output. The latent code of each low-resolution
image is optimized jointly with the target Super-NeRF representation
to fully exploit the view-consistency constraint inherent in NeRF construction.
We verify the effectiveness of Super-NeRF on synthetic, real-world, and
AI-generated NeRF datasets. Super-NeRF achieves state-of-the-art NeRF
super-resolution performance on high-resolution detail generation and
cross-view consistency.
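The joint optimization described in this abstract, one latent code per low-resolution view optimized together with a shared scene representation, can be illustrated with a toy sketch. It is not the Super-NeRF method: the scene is reduced to a single scalar, each "super-resolved view" is assumed to be the low-resolution value plus its latent code, and `joint_optimize` is a hypothetical name introduced only for this example.

```python
from typing import List, Tuple


def joint_optimize(
    lr_values: List[float], steps: int = 200, rate: float = 0.1
) -> Tuple[float, List[float]]:
    # shared: toy stand-in for the scene representation (one scalar).
    # codes:  one latent code per view, assumed to act as an additive detail.
    shared = 0.0
    codes = [0.0] * len(lr_values)
    for _ in range(steps):
        # Consistency residual per view: shared value minus that view's output.
        residuals = [shared - (v + z) for v, z in zip(lr_values, codes)]
        # Gradient descent on the sum of squared residuals, updating the
        # shared value and every latent code in the same step.
        # For this toy quadratic, rate must stay below 1 / (len(lr_values) + 1).
        shared -= rate * 2.0 * sum(residuals)
        codes = [z + rate * 2.0 * r for z, r in zip(codes, residuals)]
    return shared, codes


if __name__ == "__main__":
    views = [0.2, 0.5, 0.9]
    shared, codes = joint_optimize(views)
    outputs = [v + z for v, z in zip(views, codes)]
    print(shared, outputs)
```

After optimization, every view's output agrees with the shared value, which mirrors in miniature how a consistency constraint can pull independently enhanced views toward one common representation.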